Description
Audio File Transmission Method 

Background of Invention 

[0001] FIELD OF INVENTION 

[0002] The present invention relates generally to a method of 
transmitting audio messages over a network, and more 
particularly, a method of emulating voice messaging using 
electronic mail technology. 

[0003] RELATED APPLICATIONS 

[0004] This disclosure is a divisional application claiming the
benefit of the filing date of pending U.S. patent application
entitled: "Audio File Transmission Method," by the
same inventor, filed on August 31, 2001, bearing Serial 
No. 09/682,431, which is a Continuation-in-Part to U.S. 
patent application Ser. No. 09/517,415 that was filed 
March 2, 2000 and issued as U.S. Patent No. 6,385,306 on 
May 7, 2002. 

[0005] BACKGROUND OF THE INVENTION 



[0006] Electronic mail ("email") has proliferated as a common 
method of communication. Initial communications con- 
sisted of ASCII (American Standard Code for Information 
Interchange) text. In an ASCII file, each alphabetic, numeric,
or special character is represented with a 7-bit binary
number (a string of seven 0s or 1s). 128 possible
characters are defined. However, basic ASCII text email 
messages have progressed to include graphics, audio and 
even video. Graphic images, digital audio files and digital 
video all require an encoding and decoding process when 
transmitted over the Internet. A user wishing to encode a 
voice message and send the message to a preselected 
email address had to accomplish several steps and have 
certain hardware and software equipment. The user would 
typically record their voice message on a computer using 
a sound card attached or integrated into the motherboard 
of a computer. 
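
By way of illustration only, the 7-bit ASCII representation described above might be sketched as follows. The function name to_ascii_bits is hypothetical and forms no part of the disclosure.

```python
def to_ascii_bits(text: str) -> list[str]:
    """Return the 7-bit binary string for each ASCII character in text."""
    bits = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            # Only the 128 characters defined by ASCII fit in seven bits.
            raise ValueError(f"{ch!r} is outside the 128-character ASCII set")
        bits.append(format(code, "07b"))
    return bits


print(to_ascii_bits("Hi"))  # ['1001000', '1101001']
```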

[0007] The voice message is a sequence of analog signals that 
are converted to digital signals by the audio card, using a 
microchip called an analog-to-digital converter (ADC). 
When sound is played, the digital signals are sent to the 
speakers where they are converted back to analog signals 
that generate varied sound. Audio files are usually com- 
pressed for storage or faster transmission. Audio files can 
be sent in short stand-alone segments - for example, as 
files in the WAV format. In order for users to receive 
sound in real-time for a multimedia effect, listening to 
music, or in order to take part in an audio or video con- 
ference, sound must be delivered as streaming sound. 
More advanced audio cards support wavetables, or pre- 
captured tables of sound. The most popular audio file for- 
mat today is MP3 (MPEG-1 Audio Layer-3). 
[0008] Once these digital audio files reside on the hard drive of 
the user, the user would attach the file to an email sent to 
a selected recipient. When the file is attached, it might be 
transmitted in a standardized protocol such as Multipurpose
Internet Mail Extensions (herein "MIME"). MIME is
an extension of the original Internet e-mail protocol that 
lets people use the protocol to exchange different kinds 
of data files on the Internet: audio, video, images, appli- 
cation programs, and other kinds, as well as the ASCII 
handled in the original protocol, the Simple Mail Transfer
Protocol (SMTP). In 1991, Nathaniel Borenstein of Bellcore
proposed to the Internet Engineering Task Force that 
SMTP be extended so that Internet (but mainly Web) 
clients and servers could recognize and handle other 
kinds of data than ASCII text. As a result, new file types 
were added to "mail" as a supported Internet Protocol file 
type. 
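
As a non-limiting sketch, attaching a digital audio file to an email message as a MIME part might look like the following, using the Python standard library. The function name, the subject line and the "audio/wav" media type label are assumptions chosen for illustration.

```python
from email.message import EmailMessage


def build_voice_email(sender: str, recipient: str, audio_bytes: bytes,
                      filename: str = "message.wav") -> EmailMessage:
    """Wrap raw audio bytes in a multipart MIME email message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Voice message"  # assumed subject line
    msg.set_content("A voice message is attached.")
    # The audio is carried as a separate, base64-encoded MIME part.
    msg.add_attachment(audio_bytes, maintype="audio", subtype="wav",
                       filename=filename)
    return msg
```

The resulting message could then be handed to any SMTP client for delivery.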

[0009] Attempts have been made to develop unified messaging 
systems that link video, text, audio, document manage- 
ment and the like into a single system. However, such at- 
tempts have not provided a means to enable voice mes- 
saging to maintain continuity throughout a thread, store 
outbound voice communications, and provide a reply 
function equivalent to the simplicity of email replies. Furthermore,
many unified messaging applications require expensive
proprietary equipment, whereas typical SMTP email
servers are inexpensive, well-developed and already em- 
ployed by most medium or larger business entities. 

[0010] Consequently, there is a need in the art for a method of 
transmitting an audio voice message to an email address 
without the need of a computer. 

[0011] There is a further need in the art for a method of replying 
to an audio voice message without the need to key in a 
telephone number. 

[0012] There is a further need in the art for a method of maintaining
a thread of voice mail correspondence.

[0013] There is a further need in the art for a means to store and 
validate the transmission and content of an outgoing 
voice message. 

[0014] There is a further need in the art for a novel dual email
and telephone extension identity for a user to access voice
messages from a telephone or computer using a common 
set of alphanumeric characters. 

[0015] There is a further need in the art for a new means of de- 
livering audio advertisements to a captive audience. 

[0016] There is a further need in the art for a new means of ob- 
taining survey and demographic data. 

[0017] There is a further need in the art for a method to provide 
untethered access to voice and email messaging using 
voice command zones. 

[0018] There is a further need in the art for a means of schedul- 
ing the delivery of voice messages to enhance the impact 
on the recipient. 

[0019] However, in view of the prior art at the time the present
invention was made, it was not obvious to those of ordinary
skill in the pertinent art how the identified needs
could be fulfilled.

Summary of Invention 

[0020] The above and other objects of the invention are achieved 
in the embodiments described herein by providing a com- 
puter implemented method of transmitting electronic 
voice messages comprising the steps of establishing a 
caller identity associated with a first telephone connec- 
tion. This may be achieved by parsing the Telco caller ID 
string and associating that string with a preexisting user 
record. In other words, a call from 727-507-8558 might 
be associated with the identity of a law firm and linked to 
additional records such as address, fax, user name, email 
address and the like. For most applications, it is preferred 
that the caller ID string be linked to the email address of 
the caller in order that the recipient may easily reply to 
the original message. 
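
By way of illustration only, associating a caller ID string with a preexisting user record might be sketched as follows. The record contents, the dictionary USER_RECORDS and both function names are hypothetical; the normalization assumes North American ten-digit numbers.

```python
import re

# Hypothetical user records keyed by a normalized ten-digit caller ID.
USER_RECORDS = {
    "7275078558": {"name": "Example Law Firm", "email": "office@example.com"},
}


def normalize_caller_id(raw: str) -> str:
    """Strip punctuation and a leading country code from a caller ID string."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


def lookup_caller(raw: str):
    """Return the user record linked to the caller ID, or None if unknown."""
    return USER_RECORDS.get(normalize_caller_id(raw))
```

The email address in the returned record is what allows a recipient to reply to the original message.
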
[0021] In the next step, a first audio clip is recorded from the
first telephone connection. The audio clip may be digi- 
tized in any number of computer readable formats includ- 
ing, but not limited to, WAV, AIFF, MP3 and the like. An 
email target string is then established. This might be re- 
solved by a number of alternative methods. First, a "speed 
dial" interface may be established. The interface might be 
resident in an operating system, network appliance or on 
a website. Users pre-configure their settings similar to 
speed dials used in standard telephone systems wherein 
numerals are associated with the email target string. This 
has the advantage that alphanumeric characters are easily 
entered on a computer keyboard, but often problematic 
on a telephone system. A typical telephone has twelve 
keys which may be depressed in various iterations to re- 
solve an alphabetic character. Alternatively, the individual 
alphanumeric characters may be individually spoken and 
resolved with voice recognition means as disclosed in U.S. 
Patent Application Serial No. 09/517,415 filed March 2, 
2000 which is incorporated by reference. 
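
As a non-limiting sketch, the two keypad-based ways of resolving an email target string described above might look like the following. The speed-dial table, the function names and in particular the symbols assigned to the "1" key are assumptions made for illustration; the letter layout of keys 2 through 9 follows the conventional telephone keypad.

```python
# Hypothetical speed-dial table: numerals mapped to email target strings.
SPEED_DIAL = {"1": "partner@example.com", "2": "client@example.org"}

# Conventional letter layout; the symbols on "1" are an assumption.
KEYPAD = {
    "1": "@._-", "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def resolve_speed_dial(digits: str):
    """Return the pre-configured email target for a speed-dial entry."""
    return SPEED_DIAL.get(digits)


def decode_multitap(runs: list[str]) -> str:
    """Decode runs of repeated key presses, e.g. ["2", "22"] gives "ab"."""
    chars = []
    for run in runs:
        if not run or set(run) != {run[0]} or run[0] not in KEYPAD:
            raise ValueError(f"invalid key run: {run!r}")
        letters = KEYPAD[run[0]]
        # Extra presses wrap around the characters on the key.
        chars.append(letters[(len(run) - 1) % len(letters)])
    return "".join(chars)
```

The length of the multi-tap input relative to the speed-dial entry illustrates why pre-configured numerals are easier to use on a telephone.
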
[0022] The first audio clip, the caller identity and the email target 
string are all encapsulated into a first email attachment. 
The first email attachment is then transmitted to a first 
email account which is associated with the email target 
string. This is typically an SMTP address accessed directly 
from an information store such as Microsoft Exchange, 
Novell Groupwise or the like. Alternatively, it may be 
stored offsite and accessed through a POP3 means. A sec- 
ond telephone connection accesses the first email attach- 
ment wherein the first audio clip is broadcast to the sec- 
ond telephone connection. Once broadcast, the caller is 
then prompted for a reply recording. Responsive to a 
record signal, a second audio clip from the second tele- 
phone connection is then recorded. The second audio clip, 
the caller identity and the email target string are all encapsulated
into a second email attachment which is then
transmitted to a second email account associated with the 
caller identity. 
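
By way of illustration only, routing the reply recording back to the email account associated with the original caller might be sketched as follows. The function names, the "X-Caller-ID" header and the example.com message-id domain are hypothetical and are not part of the disclosure.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def make_voice_message(caller_email: str, target: str, clip: bytes,
                       caller_id: str, subject: str = "Voice message") -> EmailMessage:
    """Encapsulate an audio clip, caller identity and target into one email."""
    msg = EmailMessage()
    msg["From"] = caller_email
    msg["To"] = target
    msg["Subject"] = subject
    msg["Message-ID"] = make_msgid(domain="example.com")
    msg["X-Caller-ID"] = caller_id  # hypothetical header carrying the caller ID
    msg.set_content("Voice message attached.")
    msg.add_attachment(clip, maintype="audio", subtype="wav", filename="clip.wav")
    return msg


def make_reply(original: EmailMessage, reply_clip: bytes, replier_id: str) -> EmailMessage:
    """Build the second email: addressed to the original caller's account."""
    reply = make_voice_message(
        caller_email=str(original["To"]),
        target=str(original["From"]),
        clip=reply_clip,
        caller_id=replier_id,
        subject="Re: " + str(original["Subject"]),
    )
    # Link the reply to the message it answers.
    reply["In-Reply-To"] = str(original["Message-ID"])
    return reply
```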

[0023] In order to establish a thread of messages having a common
topic, the first audio clip is appended to the second
audio clip. The second audio clip might be placed after 
the first audio clip in order to preserve the chronological 
order of the thread, or the second audio clip might be placed
before the first audio clip in order to avoid the need to 
hear previously transmitted audio data. It should be un- 
derstood that the reply loops may continue on far beyond 
one initiating email and a single reply. It is also anticipated
that carbon copies, blind carbon copies and forwarding
be enabled, as they are standard SMTP functions.
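
As a non-limiting sketch, combining two audio clips into a single threaded clip in either order might look like the following. It assumes both clips are uncompressed WAV data with identical channel count, sample width and sample rate; the function names are hypothetical.

```python
import io
import wave


def make_clip(frames: bytes, rate: int = 8000) -> bytes:
    """Helper: wrap raw 16-bit mono frames in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(rate)
        writer.writeframes(frames)
    return buf.getvalue()


def append_clips(first_wav: bytes, second_wav: bytes,
                 newest_first: bool = False) -> bytes:
    """Join two WAV clips chronologically, or with the reply placed first."""
    ordered = [second_wav, first_wav] if newest_first else [first_wav, second_wav]
    out = io.BytesIO()
    params = None
    with wave.open(out, "wb") as writer:
        for clip in ordered:
            with wave.open(io.BytesIO(clip), "rb") as reader:
                fmt = (reader.getnchannels(), reader.getsampwidth(),
                       reader.getframerate())
                if params is None:
                    params = fmt
                    writer.setnchannels(fmt[0])
                    writer.setsampwidth(fmt[1])
                    writer.setframerate(fmt[2])
                elif fmt != params:
                    raise ValueError("clips must share channels, width and rate")
                writer.writeframes(reader.readframes(reader.getnframes()))
    return out.getvalue()
```

Setting newest_first places the reply ahead of the earlier clip, so a listener need not hear previously transmitted audio first.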

[0024] It is preferred that the recording time is encapsulated into
the email attachments with a date/time stamp. This may 
be achieved by encoding the alphanumeric characters of 
the date and time into the text body of the email mes- 
sage. Alternatively, the date and time may be synthesized 
as speech and appended to the audio clip. 

[0025] Providing system security includes the steps of establishing
the caller identity associated with the first telephone
connection by receiving a password entry from the first 
telephone connection and associating the password entry 
against a preexisting caller account. Receiving the pass- 
word entry may include interpreting at least one DTMF 
signal responsive to the keying of buttons on a touch- 
tone telephone. Alternatively, the process may include the 
step of receiving the password entry by interpreting at 
least one alphanumeric character as spoken into the tele- 
phone. In another embodiment, the invention utilizes bio- 
metrics by receiving a predetermined call phrase, reducing 
the call phrase to a voice pattern, and associating the 
voice pattern against a preexisting caller account by 
speech identification means such as described by U.S. 
Patent No. 5,608,784 of which specification is incorpo- 
rated by reference. 
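
By way of illustration only, interpreting DTMF signals as a password entry might be sketched as follows. The sketch assumes that tone detection has already reduced each key press to its pair of nominal row and column frequencies in hertz; the function names and the plain-text stored password are simplifications made for illustration.

```python
import hmac

# Standard DTMF (row Hz, column Hz) pairs for a twelve-key telephone.
DTMF_KEYS = {
    (697, 1209): "1", (697, 1336): "2", (697, 1477): "3",
    (770, 1209): "4", (770, 1336): "5", (770, 1477): "6",
    (852, 1209): "7", (852, 1336): "8", (852, 1477): "9",
    (941, 1209): "*", (941, 1336): "0", (941, 1477): "#",
}


def decode_dtmf(tone_pairs) -> str:
    """Translate detected (row, column) frequency pairs into keyed characters."""
    return "".join(DTMF_KEYS[pair] for pair in tone_pairs)


def check_password(entered: str, stored: str) -> bool:
    """Compare the keyed entry with the account password in constant time."""
    return hmac.compare_digest(entered.encode(), stored.encode())
```

A deployed system would more likely store a salted hash of the password than the password itself.
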
[0026] The email target string may be established by the steps of 
interpreting a plurality of DTMF signals responsive to the 
keying of buttons on a touch-tone telephone. Alterna- 
tively, the step may include receiving a plurality of indi- 
vidually spoken alphanumeric characters representative of 
the email target string and translating the spoken charac- 
ters to their binary equivalent by a speech identification 
means. 



[0027] With sensitive communications, it is preferred that the
email attachment be encrypted. As an added security
measure, an additional step may be employed by establishing
a hash of the first email attachment and storing the
hash in a secure storage means. 
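
As a non-limiting sketch, establishing and later checking a hash of an email attachment might look like the following. The choice of SHA-256 and the function names are assumptions; the disclosure does not name a particular hash algorithm.

```python
import hashlib
import hmac


def attachment_digest(attachment: bytes) -> str:
    """Return a hex digest of the attachment (SHA-256 assumed)."""
    return hashlib.sha256(attachment).hexdigest()


def verify_attachment(attachment: bytes, stored_digest: str) -> bool:
    """Check that an attachment still matches the digest stored earlier."""
    return hmac.compare_digest(attachment_digest(attachment), stored_digest)
```

Any change to the attachment after the digest is stored causes verification to fail.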

[0028] Another embodiment of the invention includes the steps of
establishing a first SMTP email address, the first SMTP
email address having a distinct prefix address and a domain
address. For example, in a common email address
format such as "2224848@uspto.gov" the prefix address
would be "2224848" and the domain address would be
"@uspto.gov." In the next step a primary telephone
number is established. This number is preferably a toll-free
number that is easily remembered with an alphabetic 
phrase correlated to the numerals of the number. For illustrative
purposes, an example number might be
"1-800-555-EMAIL." In the next step, an extension to the
primary telephone number is established. The extension 
contains alphanumeric characters identical to the distinct 
prefix address. For example, the above-mentioned email 
address would have an extension of 2224848. The full di- 
aling string to the account would be "1-800-555-EMAIL, 
ext. 2224848." A text-to-speech synthesizer is established
which is responsive to a call to the primary telephone
number and extension wherein email messages sent to 
the first SMTP address are synthesized into computer- 
generated speech. A voice digitizing means is then estab- 
lished wherein reply messages spoken to the primary 
telephone number and extension are converted into an 
audio computer file and transmitted to a second SMTP ad- 
dress as an email attachment. 
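
By way of illustration only, deriving the telephone extension from the prefix of an SMTP address might be sketched as follows. The function names are hypothetical, and the sketch assumes an all-numeric prefix so that the extension can be keyed on a touch-tone telephone.

```python
def extension_for(address: str) -> str:
    """Return the extension that matches the prefix of an email address."""
    prefix, sep, domain = address.partition("@")
    if not sep or not prefix or not domain:
        raise ValueError(f"not a prefix@domain address: {address!r}")
    if not (prefix.isascii() and prefix.isdigit()):
        # Assumption: only numeric prefixes are dialable as extensions.
        raise ValueError("prefix must be numeric to serve as an extension")
    return prefix


def dialing_string(primary_number: str, address: str) -> str:
    """Combine the primary number with the extension shared with the address."""
    return f"{primary_number}, ext. {extension_for(address)}"


print(dialing_string("1-800-555-EMAIL", "2224848@uspto.gov"))
# 1-800-555-EMAIL, ext. 2224848
```
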
[0029] Callers that wish to retrieve their email from a regular 
telephone may be identified by businesses as a potential 
target for marketing new products and services. Further- 
more, the caller is somewhat of a captive audience as the 
caller is seeking information of personal interest. An al- 
ternative embodiment of the invention includes the step 
of broadcasting a commercial to callers of the first SMTP 
email address. For example, the caller might hear "before 
we retrieve your messages, please listen to a brief mes- 
sage from our sponsor..." The caller might be presented 
with an option to pay for a subscription if they find the 
sponsor messages annoying or may enjoy the service free 
by their exposure to the advertisements. Another step 
may include surveying callers to the primary telephone 
number regarding caller demographics. The demographic 
may include age, gender, occupation, residence and the 
like. Using this information, demographically targeted 
commercials may be broadcast to callers according to 
survey results. To encourage callers to engage in the sur- 
vey questions, an additional step of offering prize incen- 
tives for engaging in survey activities may be included. 
The prize incentives may be long distance telephone cred- 
its and, preferably, the assignment of those credits to a 
preexisting long distance calling card. 
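
As a non-limiting sketch, selecting a commercial according to survey results might look like the following. The commercial table, the clip file names and the use of age as the only targeting criterion are hypothetical.

```python
# Hypothetical commercials, each targeted at an inclusive age range.
COMMERCIALS = [
    {"clip": "student_loans.wav", "min_age": 18, "max_age": 29},
    {"clip": "retirement_plan.wav", "min_age": 50, "max_age": 120},
]
DEFAULT_CLIP = "general_sponsor.wav"


def pick_commercial(survey: dict) -> str:
    """Return the clip matching the caller's survey answers, else a default."""
    age = survey.get("age")
    if age is not None:
        for ad in COMMERCIALS:
            if ad["min_age"] <= age <= ad["max_age"]:
                return ad["clip"]
    # Callers who skipped the survey, or match no target, hear the default.
    return DEFAULT_CLIP
```
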
[0030] One objective of the current invention is to simplify the
task of voice messaging and remove the tether of electronic
equipment from the user. Accordingly, an alternative
embodiment of the invention includes the steps of
establishing a voice command zone. The voice command 
zone is a pre-designated area from which spoken com- 
mands and voice recordings are obtained. The zone may 
be established in an automobile passenger compartment, 
a human-inhabitable dwelling, or the like. An array of 
voice command instructions relating to the operation of 
an electronic message system is established. A microphone
input means receives the voice command instructions.
Responsive to a record command, an audio message
received by the microphone input means is encapsulated
into an email attachment and transmitted to a predetermined
SMTP address. A play command broadcasts
audio files attached to emails through a speaker means. 
Alternatively, a text-to-voice synthesizer synthesizes text 
messages into speech which is then broadcast through 
the speaker means responsive to the play command. For 
zones that have low ambient noise, an omnidirectional
microphone is appropriate. However, where voice com- 
mands from a specific individual are desired, an alterna- 
tive embodiment of the invention includes a target loca- 
tion sensor means. This may include an RF or IR transmit- 
ter placed on the individual. A target acquisition means 
picks up the RF or IR broadcast and points a unidirectional 
microphone input means towards the individual thereby 
avoiding extraneous audio noise. 
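
By way of illustration only, matching a spoken utterance against an array of voice command instructions might be sketched as follows. The sketch assumes the utterance has already been transcribed to text by a speech recognition means; the command set and function name are hypothetical.

```python
# Hypothetical array of voice command instructions.
COMMANDS = {"record", "play", "reply", "send"}


def parse_voice_command(transcript: str):
    """Split a transcribed utterance into (command, argument), or return None."""
    words = transcript.strip().lower().split(maxsplit=1)
    if not words or words[0] not in COMMANDS:
        return None
    argument = words[1] if len(words) > 1 else ""
    return words[0], argument
```

A recognized "record" command would then start audio capture, and a "play" command would start playback through the speaker means.
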
[0031] In yet another embodiment of the invention, a caller identity
is established with a telephone connection and an audio
clip is recorded from the telephone connection.
An email target string is established. The audio clip, caller 
identity and email target string are encapsulated into an 
email attachment. The email attachment is then transmit- 
ted to an email account associated with the email target 
string and the email attachment is also stored in a sent 
items repository. This embodiment of the invention serves 
a critical function, particularly in the business and legal 
communities of providing a record of outbound commu- 
nication. In the prior art, one business person may leave a 
message on a voice mail system, but has no means to 
prove that message was left, much less the actual content 
of that message. The ability to produce a record of out- 
bound communication from one business to another 
serves to validate that important communications
and messages were actually transmitted. It is preferred
that along with the email attachment, the email target 
string and a date-time stamp string are stored in associa- 
tion with the email attachment. 
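
As a non-limiting sketch, a sent items repository that stores each outbound attachment together with its email target string and a date-time stamp might look like the following. The class and field names are hypothetical, and an in-memory list stands in for whatever storage means is actually used.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SentItem:
    target: str          # email target string
    sent_at: datetime    # date-time stamp
    attachment: bytes    # the encapsulated email attachment


class SentItemsRepository:
    """In-memory stand-in for a sent items store."""

    def __init__(self):
        self._items = []

    def record(self, target: str, attachment: bytes, sent_at=None) -> SentItem:
        """Store a copy of an outbound attachment with its target and time."""
        stamp = sent_at or datetime.now(timezone.utc)
        item = SentItem(target, stamp, attachment)
        self._items.append(item)
        return item

    def sent_to(self, target: str) -> list:
        """Return every stored item addressed to the given target."""
        return [item for item in self._items if item.target == target]
```
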
[0032] In an alternative embodiment of the invention, a caller
may wish to have a communication delivered at a predetermined
time. Email communications are substantially instantaneous.
However, it might be known that an intended
recipient may not be currently available due to work
schedule, time zone differences or the like. If the commu- 
nication is immediately transmitted, the recipient may find 
it buried below more recent communications. Furthermore,
many message systems provide immediate feedback
with sound or an interface display when a new message is 
received. Accordingly, it would be advantageous to pro- 
vide the ability to schedule the delivery of the communi- 
cation. In this embodiment of the invention, a caller iden- 
tity is associated with a telephone connection. An audio 
clip is recorded from the telephone connection. An email 
target string is established. The audio clip, caller identity 
and email target string are encapsulated into an email at- 
tachment. A broadcast time is established. The email at- 
tachment is then held in a queue until the broadcast time 
is reached wherein the email attachment is transmitted to 
an email account associated with the email target string. 
The email target string may be associated with a time 
zone associated with the physical location of the intended 
recipient. The broadcast time then may be automatically 
calculated relative to the time zone. 
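
By way of illustration only, calculating a broadcast time relative to the recipient's time zone and holding messages in a queue until that time might be sketched as follows. The names, the fixed UTC offset (which ignores daylight saving time) and the 9:00 local delivery hour are assumptions made for illustration.

```python
import heapq
from datetime import datetime, timedelta, timezone


def next_delivery_utc(now_utc: datetime, recipient_offset_hours: float,
                      local_hour: int = 9) -> datetime:
    """Return the next UTC instant at which it is local_hour:00 for the recipient."""
    tz = timezone(timedelta(hours=recipient_offset_hours))
    local_now = now_utc.astimezone(tz)
    candidate = local_now.replace(hour=local_hour, minute=0, second=0, microsecond=0)
    if candidate <= local_now:
        candidate += timedelta(days=1)  # that hour has passed; use tomorrow
    return candidate.astimezone(timezone.utc)


class DeliveryQueue:
    """Holds messages until their broadcast time is reached."""

    def __init__(self):
        self._heap = []
        self._count = 0  # tie-breaker so messages themselves are never compared

    def hold(self, send_at: datetime, message) -> None:
        heapq.heappush(self._heap, (send_at, self._count, message))
        self._count += 1

    def release_due(self, now: datetime) -> list:
        """Remove and return every message whose broadcast time has arrived."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due
```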

[0033] Accordingly, it is an object of the present invention to 
provide a method of transmitting an audio voice message 
to an email address without the need of a computer. 

[0034] It is another object of the present invention to provide a
method of replying to an audio voice message without the 
need to key in a telephone number. 

[0035] It is another object of the present invention to provide a
method of maintaining a thread of voice mail correspondence.

[0036] It is another object of the present invention to provide a
means to store and validate the transmission and content 
of an outgoing voice message. 

[0037] It is another object of the present invention to provide a
novel dual email and telephone extension identity for a user
to access voice messages from a telephone or computer 
using a common set of alphanumeric characters. 

[0038] It is another object of the present invention to provide a
new means of delivering audio advertisements to a captive 
audience. 

[0039] It is another object of the present invention to provide a
new means of obtaining survey and demographic data. 

[0040] It is another object of the present invention to provide
untethered access to voice and email messaging using voice
command zones. 

[0041] It is another object of the present invention to provide a
means of scheduling the delivery of voice messages to 
enhance the impact on the recipient. 

[0042] An advantage of the invention is that it incorporates
well-developed SMTP technology to deliver voice messages
globally without incurring long distance charges. 

[0043] Another advantage of the invention is that those wishing 
to send an audio voice message do not need to have any 
computer equipment. 
[0044] Another advantage of the invention is that recipients of
voice messages are no longer required to find and dial the
number of the originator; the recipient can easily reply to
the voice message with a single click, button or spoken 
command. 

[0045] Another advantage of the invention is that the context of a 
voice message discussion may be maintained as a thread. 
Past comments may be accessed and referenced as 
needed. 

[0046] Another advantage of the invention is that outgoing voice 
messages may be easily saved in virtually any computer 
readable medium. Proprietary voice message systems are 
not required. 

[0047] Another advantage of the invention is that businesses may 
subrogate the costs of the messaging system infrastruc- 
ture with demographically targeted advertising. Users un- 
able to subscribe to the service can still have access paid 
for by advertising businesses. 

[0048] Another advantage of the invention is that delivery of 
voice messages may be coordinated to coincide with the 
schedule of the recipient so that the voice message is de- 
livered at the optimum time. 
[0049] These and other important objects, advantages, and fea- 
tures of the invention will become clear as this description 
proceeds. 

[0050] The invention accordingly comprises the features of con- 
struction, combination of elements, and arrangement of 
parts that will be exemplified in the description set forth 
hereinafter and the scope of the invention will be indi- 
cated in the claims. 
Brief Description of Drawings 

[0051] For a fuller understanding of the nature and objects of the 
invention, reference should be made to the following de- 
tailed description, taken in connection with the accompa- 
nying drawings, in which: 

[0052] FIG. 1 is a schematic view of the invention as generally
disclosed.

[0053] FIG. 2 is a schematic view of an embodiment of the invention
using a common SMTP and telephone extension
string. 

[0054] FIG. 3 is a schematic view of an embodiment of the invention
incorporating demographically targeted advertising
responsive to survey data. 

[0055] FIG. 4 is a schematic view of an embodiment of the invention
for hands-free voice messaging.

[0056] FIG. 5 is a schematic view of an embodiment of the invention
for saving outbound voice messages.

[0057] FIG. 6 is a schematic view of an embodiment of the invention
featuring scheduled message delivery.
Detailed Description 

[0058] Referring initially to FIG. 1, it will there be seen that an
illustrative embodiment of the invention is denoted by the
reference number 10 as a whole. A first telephone connection
20 is made. The caller identity is established 30 and a first
audio clip is recorded 40. The caller identity may be es- 
tablished by a plurality of means 35 which include, but are 
not limited to, DTMF entries, voice recognition of entries, 
or identification of a preexisting voice pattern all associ- 
ated with a preexisting user account. An email target 
string is established 50. The email target string may be 
established by DTMF entries, speed dial function or voice 
recognition 60. The caller identity, audio clip and email 
target string are encapsulated into a first email attach- 
ment 70. Preferably, the attachment is encrypted 80 and a 
hash is generated. The attachment is transmitted to a first 
email account 90. Responsive to a second telephone connection
100, the first email account is accessed and the
audio clip is played 110 over the second telephone connection.
The caller is prompted for a reply recording 120
and a second audio clip is recorded 130. The second audio 
clip is encapsulated into a second email attachment 140 
and optionally the first audio clip is appended 150 to es- 
tablish a thread of related messages. The second email 
attachment is then transmitted to a second email account 
160 associated 170 with the original caller identity 30. 

[0059] FIG. 2 shows an embodiment of the invention that pro- 
vides a common identity for both telephone and email 
communications while allowing access from either 
medium. An SMTP address is established 180 which con- 
tains an alphanumeric character string common to a tele- 
phone extension 200 established off a primary telephone 
number 190. Preferably, the alphanumeric characters are 
all integers, and thus easily entered on a touch-tone tele- 
phone. A text-to-speech synthesis means 210 reads text 
messages over the telephone and responses are digitized 
220 for outgoing messaging over the telephone connec- 
tion. Users are able to access the email account through 
standard POP3 mail connections as well. 

[0060] FIG. 3 illustrates an embodiment of the invention wherein 
a call is made to the extension 230 and a commercial is 
broadcast 240 to the caller. This provides a means of pay- 
ing for the infrastructure of the telecommunications sys- 
tem without necessarily requiring a subscription to the 
service. Surveys may be initiated 250 which can then de- 
mographically target 260 the most appropriate commercial 
broadcast 240 according to the characteristics of the indi- 
vidual caller. Survey prize incentives 270 may be offered to 
encourage active participation in the surveys. A means of 
distributing the dual SMTP and extension information may 
be through the use of calling cards. These calling cards 
are traditionally used for long distance service credits that 
are expended as the user makes calls. Login passwords 
for the SMTP address and/or PINs for accessing the sys- 
tem via telephone may be secured as scratch-off areas on 
the physical card itself. Alternatively, the cards may be 
wrapped or covered with an opaque material that indi- 
cates if the material has been tampered with. Such privacy 
sealing means are well known and often employed for lot- 
tery tickets and sweepstakes prizes. Long distance service 
may be employed as an incentive for submitting to sur- 
veys 280 which can then be used to encourage more use 
of the service 290. 
[0061] In FIG. 4, a voice command zone is established 300. This
zone may be the interior of a vehicle 310, a hospital room
320, or any type of human-inhabitable dwelling. An array
of voice command instructions 330 is established and
received by a microphone means 340. The microphones 350
may comprise two general types: omnidirectional 360 and
unidirectional 370. In the case of omnidirectional microphones
360, sounds from a 360 degree sweep are
recorded. However, in the case of unidirectional microphones
370, a single point is recorded. Accordingly, an RF
or IR transmitter may be placed on the person 380 sought
to be recorded and an RF or IR acquisition means 390
dynamically tracks the location of the emitter 380 and points
the unidirectional microphone 370 towards the appropriate
target. A voice command initiates a corresponding
SMTP command 400 to generate a new message or reply 
to an existing message. Audio recording is initiated re- 
sponsive to a record command 410 and the audio record- 
ing is encapsulated 420 into an email attachment which is 
then transmitted 430 to a predetermined email address. 
[0062] In FIG. 5, a caller identity is established 30. An audio clip
is recorded 40 and an email target string is established 50. 
The caller identity, audio clip and target string are encap- 
sulated 70 into an email attachment along with a date/ 
time stamp 440. The email attachment is then transmitted 
90 to an email account associated with the email target 
string. In addition, a copy of the email attachment is 
stored 450 in a sent items repository. 
[0063] In FIG. 6, a caller identity is established 30. An audio clip
is recorded 40 and an email target string is established 50. 
A broadcast time is established 460. This time may be resolved
automatically by calculating the time zone difference
470 between the sender and recipient of the message.
The caller identity, audio clip and email target string are 
all encapsulated into an email attachment 70 and then 
transmitted to a predetermined email address 90 associ- 
ated with the email target string at the established broad- 
cast time. 

[0064] It will be seen that the objects set forth above, and those
made apparent from the foregoing description, are effi- 
ciently attained and since certain changes may be made in 
the above construction without departing from the scope 
of the invention, it is intended that all matters contained 
in the foregoing description or shown in the accompany- 
ing drawings shall be interpreted as illustrative and not in 
a limiting sense. 

[0065] It is also to be understood that the following claims are
intended to cover all of the generic and specific features 
of the invention herein described, and all statements of 
the scope of the invention which, as a matter of language, 
might be said to fall therebetween. Now that the invention 
has been described, 



